SUMMARY


This report presents results of digital processing of aircraft-acquired 
Thematic Mapper Simulator (TMS) data collected during the winter season over 
a forested site in southern Mississippi. The goal of the research was to in- 
vestigate the utility of TMS data for use in forest inventories and monitoring. 
This study deals with one of four test sites selected for this research task 
under the AgRISTARS Renewable Resources Inventory Project. 

Analyses indicate that TMS data are capable of delineating the mixed forest 
land cover type to an accuracy of 92.5 percent correct. The accuracies associ- 
ated with river bottom forest and pine forest were 95.5 and 91.5 percent correct, 
respectively. These figures reflect the performance for products produced using 
the best subset of channels for each forest cover type. It was found that the 
choice of channels (subsets) had a significant effect on the accuracy of clas- 
sifications produced, and that the same channels are not the most desirable for 
all three forest types studied. Both supervised and unsupervised spectral 
signature development techniques were evaluated; the unsupervised methods proved 
unacceptable for the three forest types considered. 

INTRODUCTION 


This study was conducted as part of a research task under the AgRISTARS 
Renewable Resources Inventory (RRI) Project. The overall objectives of the re- 
search task are (1) to design and implement an efficient procedure for pro- 
cessing and analysis of Thematic Mapper (TM) digital data, and (2) to examine 
the potential utility of TM digital data in forest inventories and monitoring. 
To adequately evaluate the utility of TM digital data for forest management or 
inventorying/monitoring, analysis should not be restricted to any one major
forest ecosystem, but should include numerous forest types of major importance
to a user or user community. All results obtained would thus be indicative of 
the general nature of the forest resource, with specific problems or considera- 
tions related to each major forest type dealt with on a site-by-site level. 

Similarly, since the forest is a dynamic ecosystem with an annual biotic
cycle, the effects of phenologic season on the results obtained from digital
data analysis must be factored into the investigation.

In order to take into account these considerations of variation in forest 
ecosystems and their response to the different seasons, four study sites were 
selected in locations indicated in Figure 1. These sites represent distinctly 
different forest ecosystems, each with unique environmental conditions and 
forest cover types. In addition, data collection was scheduled to occur at
each site during each of the four major phenologic seasons (winter dormancy,
spring leaf-out, summer growth, and fall leaf abscission). The calendar dates
associated with each phenologic season were defined for each site independently,
based on knowledge of the area. The dates selected included only that time
frame during which the phenologic condition of the forest ecosystem remained
somewhat stable and typified conditions representative for each season.

Until Landsat D is launched (the last quarter of FY82 or first quarter of 
FY83) and the TM is available, this research task will employ data collected 
with an aircraft-borne TMS. 

This report deals specifically with results obtained from the analysis 
of winter TMS digital data collected over the Pearl River Basin study site in 
southern Mississippi. Subsequent reports will present results obtained for 
data collected for other seasons or for the other three study sites being used 
in this research. In addition, a comparison of results obtained from the
Landsat multispectral scanner (MSS) data and TMS data is in progress and will
be included in a subsequent report.

PEARL RIVER BASIN STUDY SITE 

The Pearl River Basin Study Site (hereafter referred to as the MS site) is
a 43.2-km (27-mi) north-south area located in southern Mississippi (Figure 1),
and includes portions of Hancock County, MS, and Saint Tammany Parish, LA. This
particular area was selected to represent the longleaf-slash pine and oak-gum-
cypress forest types (reference 1) which occur throughout much of the South
(Figure 2). The longleaf-slash pine type is typified by the occurrence of
longleaf pine (Pinus palustris Mill.) and slash pine (Pinus elliottii Engelm.), but
also includes the other southern pines, oak, and gum. Species commonly found
in the oak-gum-cypress type include sweetgum (Liquidambar styraciflua L.),
laurel oak (Quercus laurifolia Michx.), American hornbeam (Carpinus caroliniana
Walt.), American holly (Ilex opaca Ait.), water oak (Q. nigra L.) and sweet
bay (Magnolia virginiana L.) on the "drier" sites, and water tupelo (Nyssa
aquatica Marsh.), bald cypress (Taxodium distichum Rich.), and numerous species
of ash (Fraxinus spp.) on the "wetter" sites (often having standing water for
long periods; reference 2).

The topography of the MS site is flat to gently rolling, with elevations
ranging between 5 m (15 ft) in the south and 61 m (200 ft) in the north. The
distribution of generalized land cover types within the MS site is presented
in Figure 3. The lower elevations encompass the major drainage basins of the
Pearl River system in the south and the Hobolochitto River (with numerous
branches) in the north (Figure 4). Most of the remaining areas, with the
exception of one small city, are occupied by pine forests or have been cleared
of all native vegetation and are currently supporting agricultural crops.

Figure 2. Typical Southeastern U.S. Forests

Figure 3. MS Site Land Cover Types

Figure 4. MS Test Site and Surrounding Area

The overall condition of the MS site during the winter season can be
ascertained from a careful study of Figure 5. This color infrared aerial
photograph covers an area which is typical of the entire MS site and hence 
will be used as a reference for illustrating statements related to land cover 
conditions. 

All of the land in the MS site can be grouped into eight basic types. 

A brief discussion dealing with each type is presented below to familiarize
the reader with winter land cover conditions (letters in parentheses refer 
to areas marked in Figure 5): 

1. Inert Materials - This general land cover type contains such cover 
types as sand bars (D), bare soil/gravel pits (E), highways and dirt roads 
(F), and fallow agricultural fields. Parking lots, cities, buildings, etc., 
are also included in this land cover type. The physical condition of this 
land cover is, for the most part, independent of season, as little or no 
vegetative cover exists on most of these areas. However, rain does affect 
the condition of bare soil, causing it to become darker in color, and fallow 
agricultural fields will, in the other seasons, support crops. 

2. Winter Crops/Pasture - The predominant agricultural land use during 
the winter season in the MS site is pasture (H) or winter rye grass (I). 

Most other agricultural areas are fallow and have been included in the inert 
materials land cover type. Winter rye grass is very lush and green and, as
can be seen on Figure 5, is represented by a bright pink/red on color infra- 
red photography. The pastures are in their characteristic dormant state, and 
appear brown to the naked eye. 

3. Old Fields - This land cover type includes areas which have been cut
over and burned (J), or which represent the initial stages of the revegetation
of abandoned agricultural fields and other cleared areas. Depending on the
plant species which are invading a particular site, the condition of these
sites may be quite variable.

Figure 5. Color IR Photograph of Area Typical of MS Test Site

4. Marsh - Included in this land cover type are areas which are inundated 
by standing water for a majority of the year, and contain plant species typical 
of non-forested wetlands covering at least 10 percent of the surface area when 
viewed from above (K). The marsh is dormant during the winter season, and is 
quite uniform in color when viewed with the naked eye. Water is, in most in- 
stances, present as the substrate on which the dead, brown grass is standing. 

5. River Bottom Forest - This category includes forested areas (10 per- 
cent or more of the surface area covered by tree foliage when in full leaf 
conditions) that are seasonally flooded for prolonged periods (usually three 
months or more) or flooded as a result of diurnal tidal action directly or 
indirectly through water backup; 66-2/3 percent or more of the foliage cover 
is made up of the foliage of deciduous tree species when viewed from above
in the full foliage condition (L). Depending on the understory species
composition and the occurrence of "evergreen" broadleafed tree species (e.g.,
live oak), the condition of the land cover type is somewhat variable. Also, 
the amount and condition of standing water when viewed through the vegetation 
will influence the overall spectral condition of this cover type, especially 
in the dormant, leafless condition of winter. 

6. Coniferous Forest - This includes forested areas with at least 66-2/3
percent of their foliage occupied by coniferous tree species when viewed from
above (N). This includes both natural (unmanaged) stands as well as
plantations (managed) of coniferous tree species not associated with the river
bottom land cover type. Bald cypress and spruce pine (Pinus glabra Walt.)
are not included in this land cover type, since they are considered a natural
component of river bottom forest.

7. Mixed Forest - This land cover type includes forested areas with
neither river bottom nor coniferous forest types constituting 66-2/3 percent 
of the foliage cover when viewed from above in the full foliage condition 
(M). This land cover type is quite variable in composition, both in terms 
of overstory species diversity and density of stocking. In addition, where 
mixed stands do exist, a diversity of understory species which varies by 
site adds to the overall complexity of the site as seen from above. 

8. Water - Most of the water occurring in the MS site exists in one 
of three conditions, depending to a large degree on the turbidity of the 
water body in question. "A" in Figure 5 represents water located in a very 
shallow, sandy borrow pit which, due to the action of wind, waves, and rain, 
is extremely turbid; "B" represents water which is moderately turbid (the 
West Pearl River in this example); and "C" depicts very clear water, either 
shallow or deep. It is not always obvious as to which condition should be 
assigned to a particular body of water, however. As can be seen in Figure 
5, a majority of water is in small, widely scattered ponds, as well as 
narrow, meandering streams and rivers. In addition, water is present in
the understory of much of the river bottom forest area. 

THEMATIC MAPPER SIMULATOR DATA 

Data used in this study were obtained by an airborne TMS. The TMS was
designed to produce data with spectral and spatial characteristics identical
to those of the Thematic Mapper on Landsat-D, scheduled for launch later in
FY82. This sensor will have spectral resolution as shown in Figure 6, with
30-m (100-ft) spatial resolution in all channels except channel 7, which will
have 120-m (394-ft) resolution. Figure 6 also presents the spectral
resolution of currently available Landsat MSS data, as well as a generalized green
leaf reflectance curve for comparison of the two satellite systems.

Figure 6. Spectral Wavelength Characteristics of TMS and
Landsat MSS Systems

Data collected by the TMS are subsequently converted from the analog format
produced by the scanner to an 8-bit (256 levels of grey) digital format for
use in data processing and analysis activities. The TMS, with 2.5-milliradian
aperture, is configured in such a way as to encompass a 50-degree field of 
view on either side of nadir. Assuming optimal conditions of a perfectly 
flat target surface and no scan line overlap at nadir, the dimensions of a 
pixel (at 50 degrees of scan from nadir) would be: 

width (across track) = 73.96 m (243 ft) 

length (nadir side) = 46.87 m (154 ft) 

length (extreme side) = 46.96 m (154 ft) 

These dimensions are calculated for an aircraft altitude which would result 
in a 30 X 30-m (100 x 100-ft) pixel at nadir, the normal Instantaneous Field 
of View (IFOV) for the TM satellite system. In addition, the overall length 
of atmosphere through which electromagnetic energy would have to travel across 
track in order to be measured by the detector is, at 50 degrees, 1.56 times 
that at nadir. It should be obvious that these conditions are extreme enough 
to preclude processing and analysis at such large angles of look. These geo- 
metric considerations are not as critical at smaller angles of look, and at 
or near 30 degrees on either side of nadir, the dimensions of the pixels
become acceptably close to 30 m (100 ft). Therefore, data processing and
analysis were restricted to 30 degrees on either side of nadir, which
resulted in a data set containing 1184 scan lines and 418 elements (209
elements on either side of nadir).
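
The growth of the pixel footprint with scan angle can be illustrated with a short sketch. The report does not state the formulas it used, so the code below assumes a simple flat-terrain, small-angle model in which the along-track dimension grows as the secant of the scan angle and the across-track dimension as the secant squared. The function name and its default arguments (12,000-m altitude, 2.5-milliradian aperture) are illustrative choices. Under these assumptions the results at 50 degrees are close to, but not identical with, the dimensions listed above.

```python
import math

def pixel_footprint(scan_angle_deg, altitude_m=12_000.0, ifov_rad=0.0025):
    """Approximate ground footprint of one pixel over flat terrain.

    Returns (across_track_m, along_track_m, path_length_ratio).
    Assumes a small instantaneous field of view and no scan line overlap.
    """
    theta = math.radians(scan_angle_deg)
    slant_range = altitude_m / math.cos(theta)   # sensor-to-ground distance
    along = slant_range * ifov_rad               # grows as sec(theta)
    across = along / math.cos(theta)             # grows as sec^2(theta)
    return across, along, slant_range / altitude_m

for angle in (0, 30, 50):
    across, along, ratio = pixel_footprint(angle)
    print(f"{angle:2d} deg: {across:6.2f} m x {along:6.2f} m, path ratio {ratio:.2f}")
```

The path length ratio returned at 50 degrees corresponds to the factor of 1.56 quoted in the text.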
TMS data were collected on February 11, 1981, from an altitude of 12,000
m (39,370 ft) above mean terrain elevation. As noted previously, this
resulted in a spatial resolution (at nadir) of 30 x 30 m (100 x 100 ft) for
channels 1 through 6, and 120 x 120 m (394 x 394 ft) for channel 7. Note
that TMS channels are numbered in the sequence that they occur in the
electromagnetic spectrum, and are not the same number assignment as for the
TM. The data were viewed on an image display device, and examined for
radiometric fidelity and the presence of abnormal data values (detector noise,
dropouts, loss of synchronization, etc.). It was determined at this time
that TMS channel 6 contained an unacceptably high amount of streaking in
the data, traced to problems with the detector used in the scanner. Such
problems appeared on a black and white display of the data as "comb" marks
running across the data scan lines. The coefficient of variation for
channel 6 was not found to be markedly different from the other channels (Table
1), and an evaluation of a histogram for channel 6 showed no aberrant behavior.
Nevertheless, due to the amount of noise present, channel 6 was removed from
this study.
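
The coefficient of variation mentioned above is the standard deviation expressed as a percentage of the mean. As a minimal sketch, it can be recomputed from the means and standard deviations printed in Table 1. The dictionary and function names are illustrative only, and recomputed values may differ from the printed ones in the last digits because the printed inputs are rounded.

```python
# Means and standard deviations transcribed from Table 1, keyed by TMS channel.
means = {1: 37.81, 2: 92.87, 3: 85.43, 4: 79.93, 5: 67.72, 6: 46.22, 7: 105.21}
std_devs = {1: 6.53, 2: 22.05, 3: 29.96, 4: 28.17, 5: 29.54, 6: 12.45, 7: 14.45}

def coefficient_of_variation(mean, std_dev):
    """Standard deviation as a percentage of the mean."""
    return 100.0 * std_dev / mean

cv = {ch: coefficient_of_variation(means[ch], std_devs[ch]) for ch in means}
for ch in sorted(cv):
    print(f"channel {ch}: CV = {cv[ch]:.2f} percent")
```

The value for channel 6 falls inside the range spanned by the other channels, which is consistent with the statement that this statistic alone did not reveal the streaking.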

The spatial resolution of channel 7, being four times coarser than that of
the remaining six channels, resulted in the seventh channel containing only
one-sixteenth the number of pixels as the other six channels. This situation
was rectified by expanding the data for channel 7 in order to fill up the
entire file. Thus, each "pixel" in channel 7 was repeated three times in
both the scan line and element directions, resulting in a block of data
(four by four pixels in size) containing one radiometric value for all 16
pixels. The value assigned to the pixels was that of the initial channel
7 pixel which was expanded. In this manner, a channel-to-channel
registration with the other six channels was effected, while at the same time
preserving the geometric relationship of channel 7 to the other channels (four
to one).
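
The expansion of channel 7 described above amounts to pixel replication. A minimal sketch using NumPy follows; the function name and the small test array are illustrative and are not taken from the original processing software.

```python
import numpy as np

def expand_channel(coarse, factor=4):
    """Replicate every pixel into a factor-by-factor block of identical values."""
    rows_expanded = np.repeat(coarse, factor, axis=0)   # scan line direction
    return np.repeat(rows_expanded, factor, axis=1)     # element direction

# Two-by-two coarse image standing in for 120-m channel 7 data.
coarse = np.array([[10, 20],
                   [30, 40]], dtype=np.uint8)
full = expand_channel(coarse)
print(full.shape)   # sixteen times as many pixels as the coarse image
```

Because each output block carries a single radiometric value, the expanded channel adds no spatial detail; it only aligns channel 7 with the pixel grid of the other channels.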

When all problems had been corrected, the center 60 degrees of the data 
were examined for sun angle/angle of look related trends. No such problems
existed with this data set, as the aircraft data collection flight (oriented 
north/south) occurred within one half hour of local solar noon. Figures 7 
through 12 show a representative portion of the test site with images pro- 
duced from TMS data from each of the six channels of TMS data used for this 
study. 

GROUND TRUTH 

In order to establish the level of performance of TMS digital data, a
network of 136 ground truth sites was established, against which results of
the analysis of TMS data could be compared. The ground truth sites were
selected to represent the major land cover types of interest within the MS
site. Numerous ground truth sites were taken for each land cover type of
interest to ensure the statistical reliability of results obtained.

Each ground truth site was visited in the field, and a detailed survey 
was made of the land cover type present. Using photographs taken by field 
personnel, as well as the written descriptions completed for each site 
visited, all sites were assigned to one of the eight basic land cover types. 
Sites retained for use after being visited in the field were transferred onto 
a small scale vertical photograph. Ground truth corresponding to these sites 
was placed into a ground truth book and filed for later use. 

DATA PROCESSING/ANALYSIS

The initial phase of data processing dealt with registering the ground
truth data to the six channels of TMS data already located on a data file.
The most straightforward manner in which to accomplish this was to create
an additional channel of data for ground truth information. In this manner
aerial photograph-to-UTM coordinate digitization and registration of the
TMS data to a map base would not be necessary, thus eliminating air photo-
to-map base registration errors.

Figure 7. TMS Channel 1 Image

Figure 10. TMS Channel 4 Image

Figure 11. TMS Channel 5 Image

Figure 12. TMS Channel 7 Image

Table 1. Statistical Parameters Defining the Raw Data Set Used in
the Study

TMS CHANNEL                     1       2       3       4       5       6       7

MEANS                       37.81   92.87   85.43   79.93   67.72   46.22  105.21

STANDARD DEVIATION           6.53   22.05   29.96   28.17   29.54   12.45   14.45

COEFFICIENT OF VARIATION    17.27   23.74   35.07   35.24   43.62   26.94   13.77
(x 100%)

Each ground truth site was located in the TMS data using aerial photo- 
graphs collected concurrently with the digital data. Using an image display 
device, a polygon was outlined in the preprocessed TMS data which contained 
the TMS digital data corresponding to the site visited by field personnel. 
Polygon scan line/element corner coordinates corresponding to each ground 
truth site were stored for later use. 

Upon completion of polygon selection, field sheets corresponding to
each ground truth site/polygon pair were examined, and each polygon was 
given a unique identification value based on the major land cover type to 
which it was assigned. This value would serve as a cross reference between 
ground truth and TMS data, and help to make subsequent data analysis some- 
what easier. 

The "ground truth" channel of the data file containing the six-channel 
TMS data set was then initialized to contain zeros in all pixels. The coor- 
dinates for each polygon were recalled, one polygon at a time, and the value 
for all pixels in the ground truth channel within the boundary of the polygon 
being used was set to that of the corresponding polygon identification value. 
The processing of all polygons resulted in a channel of information containing 
solid polygons whose values correspond to those of the ground truth sites. 
Remaining pixels (background) retained the value of zero. By encoding the
ground truth channel in this fashion, results obtained could be analyzed on
either a general land cover basis (by grouping) or, if desired, on a
polygon-by-polygon basis.
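
The encoding of the ground truth channel can be sketched as follows. This is an illustration under stated assumptions, not the software used in the study: the polygon fill uses an even-odd (ray crossing) test on pixel coordinates, and the function names, identification values, and corner coordinates are hypothetical.

```python
import numpy as np

def polygon_mask(shape, corners):
    """Boolean mask of pixels inside a polygon, using the even-odd rule.

    corners is an ordered list of (scan_line, element) vertices.
    """
    lines, elems = np.mgrid[0:shape[0], 0:shape[1]]
    inside = np.zeros(shape, dtype=bool)
    n = len(corners)
    for k in range(n):
        l0, e0 = corners[k]
        l1, e1 = corners[(k + 1) % n]
        if l0 == l1:
            continue                      # edge parallel to the test ray
        crosses = (l0 > lines) != (l1 > lines)
        e_at_line = e0 + (lines - l0) * (e1 - e0) / (l1 - l0)
        inside ^= crosses & (elems < e_at_line)
    return inside

def encode_ground_truth(shape, polygons):
    """Build the ground truth channel; polygons maps identification value -> corners."""
    channel = np.zeros(shape, dtype=np.uint8)     # background stays zero
    for ident, corners in polygons.items():
        channel[polygon_mask(shape, corners)] = ident
    return channel

# Hypothetical identification values and corner coordinates.
polygons = {61: [(2, 2), (2, 6), (6, 6), (6, 2)],
            62: [(8, 1), (8, 4), (11, 4), (11, 1)]}
truth = encode_ground_truth((12, 8), polygons)
```

Grouping for analysis on a general land cover basis then reduces to mapping several identification values to one cover type.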

The spectral homogeneity of all polygons was then examined for each of 
the six TMS channels. This was necessary in order to prevent misplacement
of polygon corners from including dissimilar land cover types in the polygon. 

For example, several ground truth sites were located next to roads and ad- 
jacent to agricultural fields. Should the polygon be in error in such a 
manner that pixels from these land cover types were included in the polygon, 
evaluation of accuracy might lead to erroneous conclusions. Therefore, those 
pixels representing land cover types other than that intended for the polygon 
as a whole were edited out before the polygon was used. 

In addition, since several polygons were located through the use of 
only one channel of TMS data, polygon editing furnished the opportunity to 
examine all remaining channels of preprocessed data. Certain problems un- 
noticed on the particular channel used, but discernible on one or more other 
channels, were corrected as a result. 

Removal of the aberrant pixels (as determined through the use of polygon 
based histograms for each channel of data) was accomplished by simply zeroing 
them out in the ground truth channel of the combined data set, since all back- 
ground (unused) pixels had been assigned the value of zero. Completion of 
this editing process resulted in spectrally pure ground truth for use in the 
analysis of TMS data performance. 
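
The editing step can be sketched in code. The report states only that aberrant pixels were identified from polygon-based histograms, so the decision rule below (a pixel is dropped when any channel lies more than k median absolute deviations from the polygon median) is an assumption chosen for illustration, as are the function name and the sample data.

```python
import numpy as np

def edit_polygon(data, truth, ident, k=3.0):
    """Zero out ground truth pixels of polygon `ident` that look aberrant.

    data has shape (channels, lines, elements); truth has shape
    (lines, elements) and is modified in place. The k-times-MAD rule is an
    illustrative assumption, not the criterion used in the study.
    """
    member = truth == ident
    values = data[:, member].astype(float)                 # (channels, n_pixels)
    median = np.median(values, axis=1, keepdims=True)
    mad = np.median(np.abs(values - median), axis=1, keepdims=True)
    mad = np.maximum(mad, 1.0)                             # guard against zero spread
    aberrant = np.any(np.abs(values - median) > k * mad, axis=0)
    lines, elems = np.nonzero(member)
    truth[lines[aberrant], elems[aberrant]] = 0            # same value as background
    return int(aberrant.sum())

data = np.full((2, 6, 6), 50, dtype=np.uint8)
data[0, 2, 3] = 200                   # e.g. a road pixel caught inside the polygon
truth = np.zeros((6, 6), dtype=np.uint8)
truth[1:5, 1:5] = 7
removed = edit_polygon(data, truth, ident=7)
```

Because removed pixels take the background value of zero, the original polygon boundary is untouched while its interior is thinned.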

SUPERVISED SPECTRAL SIGNATURE DEVELOPMENT 

Having completed preprocessing of the TMS digital data, as well as
registration (to the TMS data) and editing of ground truth polygons, the next step
in the investigation was to develop supervised spectral signatures for each of
the edited ground truth polygons. Spectral signatures were developed through
the use of software which uses a directional index table approach. This
software can be instructed to examine one channel of data (a mask file) and to
develop spectral values in the mask file. Thus, the ground truth channel was
used as the mask file, and the software was instructed to develop a spectral
signature for the edited ground truth polygons. Since every ground truth
polygon in the ground truth channel had been assigned a unique value, in- 
dividual spectral signatures were developed for each. This technique was 
required, since in some instances interior pixels of ground truth polygons 
had to be eliminated as described previously, while the original polygon 
boundary remained unchanged. Spectral signatures developed in this manner 
were stored in a disc file for subsequent use. 
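
A mask-driven signature development step can be sketched as below. The sketch assumes that a spectral signature consists of a mean vector and a covariance matrix per polygon, which is the usual input to a maximum-likelihood classifier; it does not reproduce the directional index table approach of the original software, and the function name is hypothetical.

```python
import numpy as np

def develop_signatures(data, truth):
    """Return {ident: (mean_vector, covariance_matrix)} for each nonzero mask value.

    data has shape (channels, lines, elements); truth is the mask channel.
    Each polygon needs at least two pixels for the covariance to be defined.
    """
    signatures = {}
    for ident in np.unique(truth):
        if ident == 0:
            continue                                       # background
        samples = data[:, truth == ident].astype(float)    # (channels, n_pixels)
        signatures[int(ident)] = (samples.mean(axis=1), np.cov(samples))
    return signatures

rng = np.random.default_rng(0)
data = rng.normal(100.0, 10.0, size=(3, 10, 10))
truth = np.zeros((10, 10), dtype=np.uint8)
truth[:5, :5] = 1
truth[5:, 5:] = 2
signatures = develop_signatures(data, truth)
```

Pixels zeroed during polygon editing are skipped automatically, since only nonzero mask values contribute samples.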

SUBSET SELECTION 

The six-channel spectral signatures developed via the supervised ap- 
proach were used to gain an initial understanding of the utility of each TMS 
channel. This was accomplished through the implementation of a subset selec- 
tion technique. Average transformed divergence (reference 3), based on the 
supervised statistics developed from six channels of data, served as the 
vehicle for evaluation of overall performance of each subset of channels 
examined. As channels were removed, the resulting reductions in the aver- 
age transformed divergence were noted. The subset of channels with the 
smallest reduction in average transformed divergence was selected as the 
most desirable one. 
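
For orientation, the average transformed divergence of a channel subset can be sketched as follows. The report defers to reference 3 for the definition, so the code assumes the commonly used Gaussian form of pairwise divergence with a saturating transform scaled to 2000; the scale factor, the function names, and the use of zero-based array positions rather than TMS channel numbers are all assumptions.

```python
from itertools import combinations
import numpy as np

def divergence(m1, c1, m2, c2):
    """Pairwise divergence between two Gaussian class signatures."""
    c1_inv, c2_inv = np.linalg.inv(c1), np.linalg.inv(c2)
    d = (m1 - m2).reshape(-1, 1)
    cov_term = 0.5 * np.trace((c1 - c2) @ (c2_inv - c1_inv))
    mean_term = 0.5 * np.trace((c1_inv + c2_inv) @ d @ d.T)
    return cov_term + mean_term

def transformed_divergence(m1, c1, m2, c2, scale=2000.0):
    """Saturating transform of divergence; the scale is an assumed convention."""
    return scale * (1.0 - np.exp(-divergence(m1, c1, m2, c2) / 8.0))

def average_td(signatures, channels):
    """Mean transformed divergence over all class pairs for a channel subset."""
    idx = np.array(channels)
    sub = np.ix_(idx, idx)
    values = [
        transformed_divergence(signatures[a][0][idx], signatures[a][1][sub],
                               signatures[b][0][idx], signatures[b][1][sub])
        for a, b in combinations(sorted(signatures), 2)
    ]
    return float(np.mean(values))

# Two hypothetical classes with unit covariance.
sigs = {1: (np.array([0.0, 0.0]), np.eye(2)),
        2: (np.array([2.0, 1.0]), np.eye(2))}
td_full = average_td(sigs, [0, 1])
td_one = average_td(sigs, [0])
```

The reduction noted in the text is then the difference between the value for all channels and the value for the subset under consideration.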

It is obvious that when using six channels of data, the number of
subsets which must be examined would be quite large (68 in fact). In order to
circumvent this time consuming problem of examining 68 subsets, a technique
using ordering of variables and properties of divergence was employed. This
technique, detailed in reference 4, reduces the number of subsets which must
be examined to a minimum. The results of subset selection are presented in
Table 2.

Reduction in average transformed divergence can be used to gain some
insight into the relative utility of the channel being deleted. For instance,
the reduction in average transformed divergence for the five-channel subset
resulting from the deletion of channel 1 was between 0.18 and 0.69 times that
associated with the deletion of any other single channel. This indicates
that the information content of channel 1, useful in improving the delineation
of the land cover types in this study, is quite limited. Also, with such a
small relative reduction, an investigator is more confident that this is the 
correct channel to delete to form the most useful five-channel subset. Had 
the reduction associated with the deletion of channel 1 been close to the re- 
duction values for one or more of the other channels, more careful considera- 
tion would be necessary before the five-channel subset was formed. 

This latter situation exists when considering the four-channel subset.
In this instance, the reduction due to the elimination of channels 1 and 7
was 0.84 times that of the next closest subset of four channels. Clearly,
removing channels 1 and 7 is, by definition, the proper course of action,
but since the actual reduction in average transformed divergence is so close
to the reduction associated with the next nearest subset, the decision is
not without reasonable doubt. Thus, by computing the ratio of the reduction
in average transformed divergence associated with the channels deleted
to that of the next nearest subset of the same number of channels, a
"confidence of decision" can be determined. High confidence implies that the
ratio is small (less than 0.70); moderate confidence that the ratio is
between 0.71 and 0.80; and low confidence that the ratio is between 0.80 and
1.00. The confidence of decision values for the subsets selected are also
presented in Table 2.

Table 2. Most Desirable Subsets of the Six Channels of Winter
TMS Data from MS Study Site, Based on Average Transformed Divergence
Values

NUMBER OF CHANNELS   TMS CHANNELS   "CONFIDENCE"   PERCENT OF TOTAL
IN SUBSET                           OF DECISION    REDUCTION

6 (original)         1,2,3,4,5,7    -              -
5                    2,3,4,5,7      High           0.100
4                    2,3,4,5        Moderate       0.410
3                    2,4,5          Moderate       1.240
2                    3,4            Low            3.880
1                    4              Low            17.321
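
The "confidence of decision" rule can be written as a small function. This is a sketch only: the function name is hypothetical, and the treatment of ratios that fall between 0.70 and 0.71 or exactly on 0.80, which the stated ranges leave open, is an assumption.

```python
def confidence_of_decision(best_reduction, next_reduction):
    """Classify a subset choice from the ratio of the two smallest reductions.

    best_reduction is the reduction in average transformed divergence for the
    selected subset; next_reduction is that of the next nearest subset with
    the same number of channels.
    """
    ratio = best_reduction / next_reduction
    if ratio < 0.70:
        label = "High"
    elif ratio <= 0.80:          # assumed to absorb the 0.70 to 0.71 gap
        label = "Moderate"
    else:
        label = "Low"
    return ratio, label

# Hypothetical reductions, chosen only to exercise each branch.
print(confidence_of_decision(0.18, 1.00))
print(confidence_of_decision(0.75, 1.00))
print(confidence_of_decision(0.90, 1.00))
```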

The desirability of creating subsets of various numbers of channels can 
be estimated by examining the percent of total transformed divergence lost 
by using the various subsets. This figure is computed as 

\[
\frac{\text{Average Transformed Divergence (6 channels)} - \text{Average Transformed Divergence (subset)}}{\text{Average Transformed Divergence (6 channels)}}
\]

and is listed in the fourth column of Table 2. As expected, the percent re- 
duction increases dramatically with the deletion of additional channels. The 
reduction associated with the five-channel subset is quite small, suggesting 
that results obtained with five channels would not be significantly different 
from those derived from six channels. The four-channel value is still small, 
but intuitively seems to be significant, and some doubt exists as to whether 
a four-channel subset should be created, since two four-channel subsets pro- 
duced almost identical reductions in transformed divergence. 

In order to test the level to which channels can be deleted without 
significantly reducing the utility of classification results, digital
classifications were produced with a maximum-likelihood algorithm using the
supervised spectral signatures corresponding to each subset listed in Table 2.
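
A maximum-likelihood classification of this kind can be sketched as follows. The sketch assumes Gaussian class signatures (mean vector and covariance matrix), equal prior probabilities, and an optional distance threshold beyond which a pixel is left unclassified (coded 0). The report does not describe the classifier at this level of detail, so the function name, the threshold, and the sample values are hypothetical.

```python
import numpy as np

def classify_ml(pixels, signatures, threshold=None):
    """Assign each pixel to the class with the highest Gaussian log-likelihood.

    pixels has shape (n_pixels, channels); signatures maps a class value to
    (mean_vector, covariance_matrix). Returns an array of class values, with
    0 marking pixels left unclassified by the optional threshold.
    """
    idents = sorted(signatures)
    n = pixels.shape[0]
    dist2 = np.empty((len(idents), n))      # squared Mahalanobis distances
    scores = np.empty((len(idents), n))     # log-likelihoods up to a constant
    for row, ident in enumerate(idents):
        mean, cov = signatures[ident]
        diff = pixels - mean
        dist2[row] = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        scores[row] = -0.5 * (np.linalg.slogdet(cov)[1] + dist2[row])
    best = scores.argmax(axis=0)
    labels = np.array(idents)[best]
    if threshold is not None:
        labels[dist2[best, np.arange(n)] > threshold] = 0
    return labels

sigs = {1: (np.zeros(2), np.eye(2)),
        2: (np.array([10.0, 10.0]), np.eye(2))}
pixels = np.array([[0.5, 0.2], [9.0, 11.0], [100.0, 100.0]])
labels = classify_ml(pixels, sigs, threshold=9.0)
```

Classifying with a channel subset only requires slicing the pixel array, mean vectors, and covariance matrices to the chosen channels before the call.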

RESULTS 

Accuracies (percent correct) for each classification produced were
established based on an independent set of ground truthed polygons not included
in spectral signature development.

The accuracy associated with water was, at this point in the analysis,
found to be quite low (5.86 percent). All errors encountered were the
result of water pixels remaining unclassified. Unclassified pixels were
treated as errors in this study, since spectral refinement and editing
had been performed on all ground truthed areas used in this investigation,
including areas used for accuracy evaluation. This suggested that the
turbidity-associated variance present with water was not adequately
represented by ground truthed areas. This situation can be seen by careful
examination of Figure 5.
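
The accuracy measure, with unclassified pixels counted as errors, can be sketched as below. The function names and the sample arrays are illustrative; class value 0 is assumed to mean background in the ground truth and unclassified in the classification result.

```python
import numpy as np

def percent_correct(truth, labels, cover_type):
    """Percent of ground truth pixels of one cover type that were labeled correctly."""
    evaluated = truth == cover_type
    correct = np.count_nonzero(labels[evaluated] == cover_type)
    return 100.0 * correct / np.count_nonzero(evaluated)

def overall_percent_correct(truth, labels):
    """Percent correct over all ground truth pixels (background excluded)."""
    evaluated = truth != 0
    correct = np.count_nonzero(labels[evaluated] == truth[evaluated])
    return 100.0 * correct / np.count_nonzero(evaluated)

# Hypothetical example: three of four water pixels (value 8) left unclassified.
truth = np.array([8, 8, 8, 8, 5, 5])
labels = np.array([8, 0, 0, 0, 5, 5])
water_accuracy = percent_correct(truth, labels, 8)
overall_accuracy = overall_percent_correct(truth, labels)
```

An unclassified pixel never matches its ground truth value, so it lowers the percent correct without any special handling.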

Additional spectral signatures were established using channel 5 to
define water/non-water, and were added to the spectral signatures previously
developed. Channel 5 was used since the spectral reflectance of water in this
region of the electromagnetic spectrum is very low, and all water, no matter how
turbid, appeared very dark on a black and white display of the data. This
action corrected the deficiency noted. Two-factor analysis of variance
was then used to determine if significant differences (at the 95 percent
level of confidence) existed between the overall results obtained from any
of the subsets of channels used and those obtained from all six channels.
The results of two-factor analysis of variance are presented in Table 3.

Column one of Table 3 presents the overall percent correct figures
associated with each of the subsets listed in Table 2. These values were
then subjected to the arcsine square-root transformation and the differences in
transformed accuracies for each comparison listed in column two of Table
3 were computed (column three). Critical values were then established,
and the differences were compared to them. Overall accuracies were
determined to be significantly different if the difference in transformed
accuracies between two subsets was greater than the critical value.
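
The comparison of transformed accuracies can be sketched as follows. The code assumes that the transformation is the arcsine of the square root of the proportion correct, expressed in degrees; with that assumption the difference for the six- versus five-channel comparison comes out close to the value tabulated in Table 3. The function names are hypothetical, and the critical values are taken from the table rather than derived.

```python
import math

def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage, in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))

def compare_accuracies(percent_a, percent_b, critical_value):
    """Return the difference in transformed accuracy and whether it is significant."""
    difference = abs(arcsine_sqrt(percent_a) - arcsine_sqrt(percent_b))
    return difference, difference > critical_value

# Six channels (91.13 percent) versus five channels (91.90 percent).
diff_6_5, sig_6_5 = compare_accuracies(91.13, 91.90, 1.521)
# Six channels versus three channels (85.01 percent).
diff_6_3, sig_6_3 = compare_accuracies(91.13, 85.01, 1.521)
print(f"6 vs 5: {diff_6_5:.3f}, significant: {sig_6_5}")
print(f"6 vs 3: {diff_6_3:.3f}, significant: {sig_6_3}")
```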
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I Table 3. NeMnan-Keuls Test Results of Two-Factor Analysis of 

Variance for Overall Results of Digital Data Classifications for 
} Pearl River Basin TMS Data Set 


Untransformed 


Accuracy 
(overall ) 

Subsets 

Compared 

Difference In 
Transformed Accuracy 

Critical 

Value 

Significant 
0.95 (**yes) 

6 ch 91.13 

6 vs 5 

0.792 

1.521 



6 vs 4 

1.217 

1.818 



6 vs 3 

5.451 

1.521 

* 


6 vs 2 

9.627 

1.818 

* 


6 vs 1 

34.369 

1.993 

* 

5 ch 91.90 

5 vs 4 

0.425 

1.521 



5 vs 3 

6.243 

1.818 

* 


5 vs 2 

10.429 

1.993 

* 


5 vs 1 

38.304 

2.116 

* 

4 ch 92.30 

4 vs 3 

6.668 

1.993 

* 


4 vs 2 

10.854 

2.116 

* 


4 vs 1 

38.304 

2.211 

* 

3 ch 85.01 

3 vs 2 

4.186 

1.521 

it 


3 vs 1 

28.917 

1.818 

* 

2 ch 87.06 
1 ch 38.30 

2 vs 1 

24.732 

1.521 

* 






In this case, no significant difference was found to exist between
overall results obtained from the four, five, and six-channel subsets. However,
the one, two, and three-channel subsets were not only significantly different
from the four, five, and six-channel subsets, but were significantly different
from each other. This leads to the conclusion that based on overall results,
no fewer than four channels of data should be used, as results are
significantly compromised with three or fewer channels.

The results of the two-factor analysis of variance were also used to
determine whether or not significant differences existed between classification
results obtained for each of the land cover types for the six subsets of
channels used. Results of this portion of the analysis of variance are
summarized in Table 4.

The bars in Table 4 underline the best (highest percent correct
classification) subsets for each land cover type which were found to be not
significantly different at the 95 percent level of confidence. Thus, for any
given land cover type, any of the subsets of channels underlined may be used
with equally useful results. This, in effect, defines the reduction in
dimensionality capable of being achieved for each land cover type listed.

It is of interest to note that no single subset of channels is underlined
for all land cover types in Table 4. This implies that the results obtained
for a given land cover type may be significantly compromised by the selection
of channels if such a selection were made on a "best overall" basis. Thus,
based on Table 4, selecting all six channels would reduce the percent correct
values for hay/grass and river bottom forest, while at the same time not
significantly affecting the overall performance achieved for the
classification as a whole.

Table 4. Results of Analysis of Variance Showing Statistically Non-significant
Subsets of TMS Data which Produced the Best Results for Each Land Cover Type
(α > 0.05, Winter Pearl River Basin Data Set). Subsets Connected by a Bar were
Non-significant. Numbers Represent Percent Correct Values Associated with each
Land Cover Type/Subset Combination.

                                   TMS CHANNELS USED
LAND COVER     1,2,3,4,5,7   2,3,4,5,7   2,3,4,5    2,4,5      3,4        4
------------   -----------   ---------   -------   ------   ------   ------
Inert                96.19       96.19     96.67    97.62    95.24    35.71
Hay/Grass            89.16       95.18     94.58    94.58    97.59        0
Old Fields           91.95       89.93     71.14    63.09    47.65        0
Marsh                89.29       89.29     89.29    75.00    64.29        0
River Bottom         86.41       91.29     95.54    76.98    87.73    63.79
Mixed Forest         92.54       90.15     85.37    82.69    80.34    55.22
Pine                 89.29       89.03     91.58    88.52    71.30    16.58
Water               100.00       94.59     95.24    99.35    96.91    20.35
Overall              91.13       91.90     92.30    85.01    87.06    38.42

Percent correct classification figures associated with each land cover
type, as well as for the classification as a whole, are also presented in
Table 4. For old fields, marsh, and mixed forest, the accuracies decrease
with each successive elimination of a channel of information. This implies,
where significant differences exist, that the particular channel deleted
contained information which made a significant contribution to the
delineation of the land cover type in question.
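
As a rough illustration of how the choice of subset trades one land cover
type against another, the sketch below looks up the highest raw percent
correct value for each cover type in Table 4 and the amount given up when the
subset with the highest overall value is used instead. It works on raw
percentages only and does not reproduce the significance groupings shown by
the bars; the names SUBSETS, TABLE_4, best_subset, and cost_of_overall_choice
are illustrative.

```python
# Percent correct values transcribed from Table 4, one column per channel subset.
SUBSETS = ["1,2,3,4,5,7", "2,3,4,5,7", "2,3,4,5", "2,4,5", "3,4", "4"]
TABLE_4 = {
    "Inert":        [96.19, 96.19, 96.67, 97.62, 95.24, 35.71],
    "Hay/Grass":    [89.16, 95.18, 94.58, 94.58, 97.59, 0.0],
    "Old Fields":   [91.95, 89.93, 71.14, 63.09, 47.65, 0.0],
    "Marsh":        [89.29, 89.29, 89.29, 75.00, 64.29, 0.0],
    "River Bottom": [86.41, 91.29, 95.54, 76.98, 87.73, 63.79],
    "Mixed Forest": [92.54, 90.15, 85.37, 82.69, 80.34, 55.22],
    "Pine":         [89.29, 89.03, 91.58, 88.52, 71.30, 16.58],
    "Water":        [100.00, 94.59, 95.24, 99.35, 96.91, 20.35],
    "Overall":      [91.13, 91.90, 92.30, 85.01, 87.06, 38.42],
}


def best_subset(cover):
    """Channel subset with the highest raw percent correct for one cover type."""
    values = TABLE_4[cover]
    return SUBSETS[values.index(max(values))]


def cost_of_overall_choice(cover):
    """Percent correct given up for `cover` when the best overall subset is used."""
    chosen = SUBSETS.index(best_subset("Overall"))
    return round(max(TABLE_4[cover]) - TABLE_4[cover][chosen], 2)


if __name__ == "__main__":
    for cover in TABLE_4:
        print(cover, best_subset(cover), cost_of_overall_choice(cover))
```

On these figures the subset with the highest overall value (channels 2, 3, 4,
and 5) is also the best for river bottom forest and pine, but it gives up
several percentage points for mixed forest and considerably more for old
fields.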

For the remaining land cover types, as well as the overall value,
accuracies tend to increase initially, and finally drop off as additional
channels are deleted. This apparently paradoxical situation resulted from
pixels which were unclassified in the high dimensional cases (and hence
tabulated as "errors") being correctly classified in reduced dimensions
(thus increasing the percent correct values presented in Table 4). No pixels
were found which had been classified into other land cover types in the
higher dimensions and which, upon deletion of channels, were correctly
classified in reduced data space. The conclusion here is that not all of the
variability of these land cover types was represented in the supervised
spectral signatures developed. Of course, this is a potential drawback to
any supervised approach. In any event, reducing the dimensionality in
cases such as these placed less restrictive limits on classification of
pixels into land cover classes. A point is reached, however, beyond which
classification performance is adversely affected with continuing deletion
of channels, as is evidenced by Table 4.
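
The effect described above can be illustrated with a deliberately simplified
classifier. The sketch assumes that a pixel is assigned to a class only when
it falls inside that class's limits on every channel used; this box rule, the
limits, and the pixel values are all hypothetical and are not the decision
rule or statistics of the software used in the study.

```python
# Hypothetical per-channel limits for one spectral signature. The numbers are
# invented for illustration and are not taken from the study.
SIGNATURES = {
    "pine": {2: (25, 35), 3: (20, 30), 4: (50, 70), 5: (40, 60), 7: (88, 122)},
}


def classify(pixel, channels, signatures=SIGNATURES):
    """Return the first class whose limits contain the pixel on every channel
    in `channels`, or None when the pixel is left unclassified."""
    for name, limits in signatures.items():
        if all(limits[c][0] <= pixel[c] <= limits[c][1] for c in channels):
            return name
    return None


# A hypothetical pine pixel whose channel 7 value lies outside the signature,
# i.e. variability that the training data did not capture.
pixel = {2: 30, 3: 25, 4: 60, 5: 50, 7: 130}

if __name__ == "__main__":
    print(classify(pixel, [2, 3, 4, 5, 7]))  # unclassified with five channels
    print(classify(pixel, [2, 3, 4, 5]))     # classified once channel 7 is dropped
```

Dropping the channel on which the signature is too narrow removes the limit
that excluded the pixel, which is how a reduced subset can show a higher
percent correct value than the full set.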

Nonetheless, the reduced channel subsets performed acceptably well (in
all cases at or above the 89 percent correct level) for at least one subset
of channels. Specific results obtained begin to show significant degradation
at various levels of channel reduction, depending on land cover; but when
viewed over all land cover types, no fewer than four channels should be used
if concerned with the eight land cover types described in this report. Had
river bottom forest behaved in a manner similar to mixed forest, the decision
would be to use five channels as a minimum.

Recall that earlier in this report there was some reasonable doubt about
forming the four-channel subset, based on a study of the reduction in average
transformed divergence. In this case, the best four-channel subset had a ratio
of 0.84 that of the next best subset of four channels (comparing the reductions
in transformed divergence), which indicates that the utility of the best
four-channel subset is only moderately better than the next best subset.
Analysis of variance determined that in only one instance (river bottom forest)
was a significant improvement realized by creating a four-channel subset, but
that in two cases (old fields and mixed forest) a significant decrease in
performance was noted. Other land cover types were not significantly affected.
The choice, then, of creating a four-channel subset depends on the land cover
type of interest; but based on all eight land cover types examined, the overall
solution would be to use the five-channel subset. This decision reinforces
the use of the relative reduction in transformed divergence as an indicator
of the desirability of further channel reduction in the generation of subsets.

Of particular interest in this investigation is the performance of TMS
data relative to the forest land cover types. In all three cases, percent
correct classification was in the range of 85-95, depending on the subset
selected. For pure stands of timber, the four-channel subset (TMS channels
2, 3, 4, and 5) performed best for river bottom forest and was one of the
non-significantly different best subsets for pine (Table 4). However, when
dealing with areas containing the attributes of both (i.e., mixed forest), at
least one additional channel of information is needed in order to achieve the
statistically best performance. Channel 7 was determined to be the next most
useful channel to add, and its addition to channels 2, 3, 4, and 5 produced a
significant improvement in the percent correct value associated with mixed
forest (Table 4). In this specific case, the addition of channel 7 reduced the
confusion between mixed forest and pine, resulting in the correct
classification of pixels which, in the four-channel subset, were classified as
pine. Although the range of mean values for pine in channel 7 (88.16 - 121.98)
overlapped the range for mixed forest (88.64 - 96.39), the statistics of the
pine classes which were confused with mixed forest in the four-channel subset
were all located at the upper portion of the range of pine, and out of
the range of mixed forest on channel 7. Thus, by adding channel 7, the
confusion was significantly resolved.
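
The channel 7 argument can be checked directly from the two ranges quoted
above. In the sketch below the ranges come from the text, while the example
class means passed to separable_on_channel_7 are hypothetical; the individual
pine class means are not given in this report.

```python
# Ranges of class mean values on channel 7, as quoted in the text.
PINE_CH7 = (88.16, 121.98)
MIXED_FOREST_CH7 = (88.64, 96.39)


def overlaps(a, b):
    """True when two closed intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


def within(value, bounds):
    """True when value lies inside the closed interval bounds."""
    return bounds[0] <= value <= bounds[1]


def separable_on_channel_7(pine_class_mean):
    """True when a pine class mean lies in the pine range but outside the
    mixed forest range, so channel 7 can tell the two apart."""
    return within(pine_class_mean, PINE_CH7) and not within(
        pine_class_mean, MIXED_FOREST_CH7
    )


if __name__ == "__main__":
    print(overlaps(PINE_CH7, MIXED_FOREST_CH7))  # the ranges overlap as a whole
    print(separable_on_channel_7(110.0))  # hypothetical mean, upper pine range
    print(separable_on_channel_7(92.0))   # hypothetical mean, inside the overlap
```

The ranges overlap as a whole, yet any pine class whose mean lies above 96.39
falls outside the mixed forest range, which matches the explanation given for
the improvement.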

ALTERNATE APPROACHES TO SPECTRAL SIGNATURE DEVELOPMENT

In addition to the supervised spectral signature development approach
already mentioned, several unsupervised techniques were also examined, using
the evaluation procedure already outlined for the supervised method.
Fundamentally, unsupervised spectral signature development differs from
supervised techniques in that unsupervised techniques "scan" the entire data
set and, within limits established by the investigator, develop spectral
signatures defining spectral cover types without prior knowledge of the land
cover types which are contained within the data set. It then becomes a matter
of relating the spectral signatures developed to land covers present, using
aerial photographs, ground truth, etc. Once the spectral signature/land cover
relationships have been established, evaluation of performance can be
conducted.

Of the techniques for unsupervised spectral signature development to be
found in the literature, two basic types were selected for inclusion in this
investigation: the sliding window approach and point clustering. These two
were selected since they represent two basic approaches to unsupervised
techniques, and because software (ELAS - reference 5) already existed for them
on the computer system being used for this study. The NSTL/ERL ELAS computer
program SRCH (SEARCH) is an unsupervised sliding window type approach to
spectral signature development. A three by three pixel window is moved through
the data set and, based on numerous parameters defining the operational limits
of the software, spectral signatures are developed defining the spectral
composition of the data used.

A modified version of SRCH, called SVCP (SRCH - variable channel
parameters), permits the user to define spectral homogeneity independently for
each channel of input data used. Since this would permit more reasonable
assumptions to be made with respect to the relationships between the channels
of input data, it was also included in the study.
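
A minimal sketch of the sliding window idea follows. It is not the SRCH or
SVCP code: the homogeneity test (per-channel standard deviation against a
limit), the parameter names, and the data layout are assumptions made for
illustration, since the actual parameters of the ELAS programs are not
described in this report. Supplying one limit per channel mimics the
per-channel control described for SVCP; repeating the same limit for every
channel gives the simpler single-setting case.

```python
from statistics import mean, pstdev


def window_signatures(image, max_std, size=3):
    """Slide a size-by-size window over `image` and return the per-channel mean
    of every window that is spectrally homogeneous.

    image:   list of rows; each row is a list of pixels; each pixel is a tuple
             holding one value per channel.
    max_std: hypothetical homogeneity limits, one per channel. A window is
             accepted when the standard deviation on each channel is at or
             below that channel's limit.
    """
    n_rows, n_cols = len(image), len(image[0])
    n_chan = len(image[0][0])
    signatures = []
    for r in range(n_rows - size + 1):
        for c in range(n_cols - size + 1):
            pixels = [image[r + dr][c + dc]
                      for dr in range(size) for dc in range(size)]
            stds = [pstdev([p[k] for p in pixels]) for k in range(n_chan)]
            if all(s <= limit for s, limit in zip(stds, max_std)):
                signatures.append(
                    tuple(mean([p[k] for p in pixels]) for k in range(n_chan))
                )
    return signatures


if __name__ == "__main__":
    # Three rows of four two-channel pixels; the last column is a different cover.
    image = [[(10, 20), (10, 20), (10, 20), (90, 20)] for _ in range(3)]
    print(window_signatures(image, max_std=(2.0, 2.0)))
```

In this toy example only the window covering the first three columns is
homogeneous; the window that straddles the boundary is rejected on the first
channel. A full implementation would also merge similar window statistics into
a manageable number of signatures, a step omitted here.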

Point clustering, another ELAS computer program (PTCL), employs techniques
to develop spectral signatures by examining individual pixels of data.
The frequency of sampling is input by the user. As each point is examined,
a decision is made as to whether or not the new pixel is spectrally similar
to points already examined. If it is, it is grouped with the similar pixel(s).
If not, it remains as a separate spectral signature, and the next pixel in
the data file is examined. The process continues until all data have been
processed.
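
The point clustering procedure described above can be sketched as a single
pass over the sampled pixels. This is not the PTCL code: the similarity test
(Euclidean distance to a running cluster mean) and the parameter names
sample_every and max_distance are assumptions made for illustration.

```python
import math


def point_cluster(pixels, sample_every, max_distance):
    """Group sampled pixels into spectral clusters in one pass.

    pixels:       sequence of pixels, each a tuple with one value per channel.
    sample_every: sampling frequency; every sample_every-th pixel is examined.
    max_distance: hypothetical similarity limit. A pixel joins the nearest
                  cluster whose mean is within this distance; otherwise it
                  starts a new cluster.
    """
    clusters = []  # each cluster is {"mean": [...], "count": n}
    for pixel in pixels[::sample_every]:
        nearest, nearest_d = None, max_distance
        for cluster in clusters:
            d = math.dist(pixel, cluster["mean"])
            if d <= nearest_d:
                nearest, nearest_d = cluster, d
        if nearest is None:
            # Not similar to anything seen so far: keep as a separate signature.
            clusters.append({"mean": list(pixel), "count": 1})
        else:
            # Similar: fold the pixel into the running mean of that cluster.
            nearest["count"] += 1
            n = nearest["count"]
            nearest["mean"] = [m + (x - m) / n
                               for m, x in zip(nearest["mean"], pixel)]
    return clusters


if __name__ == "__main__":
    data = [(10, 10), (11, 10), (50, 50), (10, 11), (51, 50)]
    for cluster in point_cluster(data, sample_every=1, max_distance=5.0):
        print(cluster["count"], cluster["mean"])
```

The result of a single-pass scheme like this depends on the order in which
pixels are examined and on the similarity limit, which is one reason various
parameter settings have to be tried.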

Various parameter settings of all of the unsupervised software were
tried, and the results were compared to those obtained from the supervised
approach discussed earlier. In all cases, the results obtained from the
supervised approach were significantly better than those from the unsupervised
approaches. This does not mean that unsupervised techniques will not perform
as well as the supervised techniques for specific land cover types, but rather
that, when considered over the forest vegetation in this test site, the
unsupervised techniques did not perform as well in their separation into the
three classes employed. For instance, SRCH produced very high accuracy values
for water, hay/grass, inert materials, river bottom forest, and pine, but did
not develop any spectral signatures defining the mixed forest; thus, based on
the objective of this investigation to deal specifically with the forest
resource, SRCH failed to perform to the same level as the supervised approach.

It is of interest to note that no matter which of the techniques is used,
ground truth polygons must be established to relate the spectral signatures to
land cover types, and (using an independent set of polygons) to evaluate
performance of the final results produced. Such areas must be spectrally
homogeneous in order to prevent the introduction of error into the experiment.
Since this is the case, the work required to incorporate ground truth into
the data analysis framework is the same for supervised and unsupervised
approaches.

CONCLUSIONS 


As a result of the analysis of TMS data collected for the MS site in
winter, the following conclusions can be made:

1. TMS data processed with supervised spectral signature development
techniques can produce land cover classifications for inert, hay/grass, old
fields, marsh, river bottom forest, mixed forest, pine forest, and water cover
types at an overall accuracy of 92.3%, and for the three forest cover types
of pine, mixed, and river bottom at accuracies of 91.5%, 92.5%, and 95.5%
respectively.

2. Overall classification performance is affected by the number of
channels used, and no fewer than four channels should be used (TMS channels
2, 3, 4, and 5).

3. Specific land cover results are affected by choice of channels
used (subset feature reduction), and the choice of subset which maximizes
the results obtained for one land cover type may adversely affect the
results obtained for another land cover type.

4. The three forest types which predominate at the MS site can be
adequately delineated by use of supervised techniques. Unsupervised
techniques, while producing results which were very good in other land cover
types, could not delineate all three forest types with the same level of
performance as the supervised technique.

It should be noted that the above conclusions relate to the eight
general land cover types that were defined for this study. Additional
research is being conducted in this test site to determine the capability
to discriminate more detailed forest cover information (e.g., species,
density, understory) with TMS data.